Abstract — Humans are able to process a face in a variety of
ways to categorize it by its identity, along with a number of
other demographic characteristics, including race, gender,
and age. Race and gender also play an important role in
face-related applications. Experimental results indicate
that participants categorized the race of the face and that
this categorization drives the perceptual process. A face
image data set is collected from the Internet and divided into a
training dataset and a test dataset. Experimental results are
based on a face database containing 250 subjects. The proposed
system can also be applied to other image-based classification
tasks.

Index Terms — race identification, PCA, face recognition 

I. Introduction 

Race refers to classifications of humans into relatively 
large and distinct populations or groups often based on 
factors such as appearance based on heritable phenotypical 
characteristics or geographic ancestry, but also often 
influenced by and correlated with traits such as culture, race 
and socio-economic status. As a biological term, race denotes 
genetically divergent human populations that can be marked 
by common phenotypic traits. Humans are able to process a 
face in a variety of ways to categorize it by its identity, along 
with a number of other demographic characteristics, including 
race, gender, and age. Over the past few decades, considerable
effort has been devoted in the biological, psychological, and
cognitive sciences to discovering how the human brain
perceives, represents, and remembers faces. Computational
models have also been developed to gain some insight into
this problem. Demographic features, such as race and
gender, are involved in human face identity recognition.
Humans are better at recognizing faces of their own race than
faces of other races [3], [4]. Golby et al. show that same-race
faces elicit more activity in brain regions linked to face
recognition [5]. They use functional magnetic resonance
imaging (fMRI) to examine whether the same-race advantage for
face identification involves the fusiform face area (FFA), which
is known to be important for face recognition [6]. Compared
with race identification, gender classification has received
more attention [7], [8], [9]. Gutta et al. [9] proposed a hybrid
classifier based on RBF networks and inductive decision trees
for classification of gender and ethnic origin, using a 64×72
image resolution. They achieved an average accuracy rate of
92% for the ethnic classification part of the task. Experimental
results for gender classification in Moghaddam and Yang [7]
are based on a 21×12 image resolution. Shakhnarovich et al.
[10] presented a real-time face detection and recognition
system based on a boosted classifier. The same structure is
used for demographic information extraction, including gender
and race. Two categories of race are defined, Asian and
non-Asian. Again, their system is focused on low-resolution
(24×24) images with weakly aligned face data. Their reported
accuracy is about 80%. The other-race effect for face
recognition has been established in numerous human memory
studies and in meta-analyses of these studies [11], [12], [13].
In fact, the other-race effect in humans can be measured in
infants as a decrease in their ability to detect differences in
individual other-race faces as early as three to nine months
of age [13].

II. Pre-processing 

The first step of pre-processing is face region
extraction: the face is extracted from the input image
using a cropping tool. The input color image is converted to
a gray image and stored in a database for processing. The
input image may be a newly scanned image or a real-world
image. The enhancement stage follows. The proposed system
accepts color images of any size and format. The enhancement
stage includes noise filtering, gray-scale conversion, and
histogram equalization.
Histogram equalization maps the input image's intensity
values so that the histogram of the resulting image has
an approximately uniform distribution. The histogram of a
digital image with gray levels in the range [0, L-1] is the
discrete function

$$p(r_k) = \frac{n_k}{n} \qquad (1)$$

where L is the total number of gray levels, $r_k$ is the $k$-th gray
level, $n_k$ is the number of pixels in the image with that gray
level, n is the total number of pixels in the image, and k = 0, 1,
2, ..., L-1. $p(r_k)$ gives an estimate of the probability of
occurrence of gray level $r_k$. Histogram equalization increases
the local contrast of the object in the image, especially
when the usable data of the image is represented by close
contrast values. Through this adjustment, the intensities can
be better distributed on the histogram. This allows areas
of lower local contrast to gain a higher contrast without
affecting the global contrast.
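
The gray-scale conversion and histogram equalization steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `to_gray` and `equalize_histogram` are hypothetical, the luminance weights are an assumption (the paper does not state which conversion it uses), and the mapping through the cumulative distribution of $p(r_k)$ is the textbook form of the technique.

```python
import numpy as np

def to_gray(rgb):
    """Convert an H x W x 3 uint8 color image to an H x W uint8 gray image.

    Assumption: ITU-R BT.601 luminance weights. The paper does not say
    which conversion it uses.
    """
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb[..., :3].astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

def equalize_histogram(gray, levels=256):
    """Histogram-equalize a uint8 gray image with `levels` gray levels."""
    # n_k: number of pixels at each gray level r_k, k = 0 .. L-1
    n_k = np.bincount(gray.ravel(), minlength=levels)
    # p(r_k) = n_k / n, the estimated probability of gray level r_k
    p = n_k / gray.size
    # The cumulative distribution of p serves as the intensity mapping
    cdf = np.cumsum(p)
    mapping = np.rint((levels - 1) * cdf).astype(np.uint8)
    return mapping[gray]

# A low-contrast patch whose values all lie between 100 and 103
patch = np.array([[100, 100, 101, 101],
                  [102, 102, 103, 103]], dtype=np.uint8)
equalized = equalize_histogram(patch)
```

After equalization the four closely spaced gray levels of the patch are spread over a much wider part of the [0, 255] range, which is the contrast gain the text refers to.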
Figure 1. PCA on Myanmar and Non-Myanmar datasets. (a)
"average" Myanmar face; (b) top 12 eigenfaces of Myanmar
dataset; (c) "average" Non-Myanmar face; (d) top 12 eigenfaces of
Non-Myanmar dataset.

III. Feature Extraction

A. Subspace Face Recognition

Principal Component Analysis (PCA) can be used for
prediction, redundancy removal, feature extraction, data
compression, and similar tasks. Because PCA is a classical
technique that operates in the linear domain, it is suited to
applications with linear models. Let us consider the PCA procedure
on a training set of M face images. Let a face image be
represented as a two-dimensional N by N array of intensity
values, or a vector of dimension $N^2$. PCA then finds
an $M'$-dimensional subspace whose basis vectors correspond
to the directions of maximum variance in the original image space.
This new subspace is normally lower dimensional ($M' \ll M
\ll N^2$). The new basis vectors define a subspace of face
images called face space. All images of known faces are
projected onto the face space to find sets of weights that
describe the contribution of each vector. By comparing a
set of weights for the unknown face to sets of weights of
known faces, the face can be identified. PCA basis vectors
are the eigenvectors of the scatter matrix S, defined as:

$$S = \sum_{i=1}^{M} (x_i - \mu)(x_i - \mu)^T \qquad (2)$$

where $\mu$ is the mean of all images in the training set and $x_i$ is
the $i$-th face image represented as a vector. The eigenvector
associated with the largest eigenvalue is the one that reflects the
greatest variance in the images. Likewise, the smallest eigenvalue
is associated with the eigenvector that captures the least
variance. A facial image can be projected onto $M'$ ($\ll M$)
dimensions by computing

$$\Omega = [v_1 \; v_2 \; \ldots \; v_{M'}]^T \qquad (3)$$

where $v_i$ is the projection of the mean-subtracted image onto
the $i$-th eigenvector.

The eigenvectors are themselves images, so-called eigenimages or
eigenfaces. They can be viewed as images and indeed look
like faces. Face space forms a cluster in image space, and PCA
gives a suitable representation of it.
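
The PCA procedure of this subsection can be sketched as below. This is an illustrative sketch under stated assumptions rather than the system's actual code: the names `fit_pca` and `project` are hypothetical, images are assumed to be flattened into the rows of a matrix, and the eigenvectors are obtained from the small M by M matrix instead of the full scatter matrix, a standard shortcut when the number of pixels far exceeds the number of training images.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA to X, an (M, D) array holding M flattened face images.

    Returns the mean face (D,), the top eigenfaces (n_components, D),
    and the matching eigenvalues (n_components,), largest first.
    """
    M = X.shape[0]
    if not 0 < n_components < M:
        # After centering, at most M - 1 eigenvalues are nonzero
        raise ValueError("n_components must be between 1 and M - 1")
    mu = X.mean(axis=0)
    A = X - mu                                  # centered images, one per row
    # The D x D scatter matrix S = A^T A is large when D = N^2, so the
    # small M x M matrix A A^T is diagonalized instead; the two matrices
    # share their nonzero eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(A @ A.T)  # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Map back to image space: if (A A^T) u = lam u,
    # then (A^T A)(A^T u) = lam (A^T u).
    eigenfaces = (A.T @ eigvecs).T
    eigenfaces /= np.linalg.norm(eigenfaces, axis=1, keepdims=True)
    return mu, eigenfaces, eigvals

def project(x, mu, eigenfaces):
    """Face-space weights of one image (D,) or a batch of images (K, D)."""
    return (x - mu) @ eigenfaces.T

# Toy data standing in for 10 flattened 8 x 8 face images
rng = np.random.default_rng(0)
faces = rng.normal(size=(10, 64))
mean_face, eigenfaces, eigenvalues = fit_pca(faces, n_components=3)
weights = project(faces, mean_face, eigenfaces)   # shape (10, 3)
```

Each row of `eigenfaces` can be reshaped to the image size and displayed, which is how pictures such as the eigenfaces in Figure 1 are typically produced.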

B. Nearest Neighbor Classification 

One of the most popular non-parametric techniques is
Nearest Neighbor classification (NNC). The asymptotic
(infinite-sample-size) error of NNC is less than twice the Bayes
error [15]. NNC gives a trade-off between the distributions of
the training data and the a priori probabilities of the classes
involved [14]. The K-nearest-neighbor (KNN) classifier
is easy to compute and very efficient. KNN is widely applicable
and requires little memory, and it has good discriminative
power. KNN is also very robust to image distortions (e.g.
rotation, illumination). Euclidean distance is used to determine
whether the input face is near a known face. The problem of
automatic face recognition is a composite task that involves
detection and location of faces in a cluttered background,
normalization, recognition, and verification.
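
Nearest-neighbor classification with Euclidean distance can be sketched as follows. The function name `nearest_neighbor_predict`, the toy two-dimensional weight vectors, and the majority-vote rule for k > 1 are assumptions made for illustration; the paper does not specify the value of k or how ties are handled.

```python
import numpy as np

def nearest_neighbor_predict(train_weights, train_labels, query_weights, k=1):
    """Label each query by majority vote among its k nearest training points."""
    train_weights = np.asarray(train_weights, dtype=float)
    query_weights = np.atleast_2d(np.asarray(query_weights, dtype=float))
    train_labels = np.asarray(train_labels)
    # Pairwise Euclidean distances, shape (n_queries, n_train)
    diffs = query_weights[:, None, :] - train_weights[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Indices of the k closest training points for each query
    nearest = np.argsort(dists, axis=1)[:, :k]
    predictions = []
    for row in nearest:
        labels, counts = np.unique(train_labels[row], return_counts=True)
        # On a tie, argmax returns the first label in sorted order
        predictions.append(labels[np.argmax(counts)])
    return np.array(predictions)

# Toy face-space weights (hypothetical values, two dimensions for brevity)
train = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
labels = ["Myanmar", "Myanmar", "Non-Myanmar", "Non-Myanmar"]
queries = [[0.2, 0.1], [5.5, 5.2]]
predicted = nearest_neighbor_predict(train, labels, queries)
```

In a full pipeline the training and query vectors would be the face-space weights obtained by projecting each image onto the eigenfaces, so the distance is computed in the low-dimensional subspace rather than on raw pixels.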

IV. Conclusions 

This paper has addressed the race identification problem 
based on facial images. The Principal Component Analysis 
(PCA) based scheme has been developed for the two-class 
(Myanmar vs. non-Myanmar) race classification task. An 
ensemble framework, which integrates the PCA for the input 
face images at multiple scales, is proposed to further improve 
the classification performance of the race identification 
system. Experimental results based on a face database 
containing 250 subjects are encouraging. The normalized 
classification scores can be used as the confidence with 
which each image belongs to a race class. This confidence is 
helpful for image-based face recognition and cross-race
face recognition. Separating the race factor from the other
factors can help the recognition system extract more
identity-sensitive features, thereby enhancing the
performance of current face identity recognition systems.
The proposed system can also be applied to other image-based
classification tasks. Future experiments using racially
ambiguous faces need to involve participants of other races.

V. Experimental Results 

A face image data set is collected from the Internet and
divided into a training dataset and a test dataset. Using a face
detector and a face alignment tool, these faces are automatically
cropped and normalized in gray level and geometry as in [6],
and each face is manually labeled with an age value subjectively
estimated by a human. The dataset is separated into two
race groups, Myanmar and Non-Myanmar. The Non-Myanmar
database is composed of research papers. Most of the
Myanmar faces are of Kachin, Kayah, and other origins. These
face images contain variations in pose, illumination,
and expression. Sample images from the databases are shown
in Fig. 3.
Figure 3. Representative faces in the database. (a) Myanmar; (b)
Non-Myanmar.

Figure 4. Performance comparison.